Classification from Pairwise Similarity and Unlabeled Data
Abstract
One of the biggest bottlenecks in supervised learning is its high labeling cost. To overcome this problem, we propose a new weakly-supervised learning setting called SU classification, which requires only similar (S) data pairs (pairs of examples that belong to the same class) and unlabeled (U) data, instead of fully-supervised data. We show that an unbiased estimator of the classification risk can be obtained from SU data alone, and that its empirical risk minimizer achieves the optimal parametric convergence rate. Finally, we demonstrate the effectiveness of the proposed method through experiments.
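To make the unbiased-estimator claim concrete, the following is a minimal sketch, not the authors' implementation. It assumes binary labels in {+1, -1}, a known positive class prior `prior_pos` different from 1/2, and that both members of every similar pair are pooled into one array of classifier scores. The names `logistic_loss` and `su_risk` are hypothetical. The weights come from writing the pooled similar-pair marginal as a mixture of the two class-conditional densities and solving for the ordinary supervised risk.

```python
import numpy as np


def logistic_loss(z, t):
    """Logistic loss log(1 + exp(-t * z)), computed stably."""
    return np.logaddexp(0.0, -t * np.asarray(z, dtype=float))


def su_risk(f_s, f_u, prior_pos, loss=logistic_loss):
    """Estimate the classification risk from similar (S) and unlabeled (U) scores.

    f_s: scores f(x) for every point that appears in a similar pair (pooled).
    f_u: scores f(x) for the unlabeled points.
    prior_pos: assumed-known class prior P(y = +1); must differ from 0.5.
    """
    if np.isclose(prior_pos, 0.5):
        raise ValueError("prior_pos = 0.5 makes the correction weights undefined")
    pi_p = prior_pos
    pi_m = 1.0 - prior_pos
    pi_s = pi_p ** 2 + pi_m ** 2          # probability that a random pair is similar
    c = 1.0 / (2.0 * pi_p - 1.0)
    # Corrected per-point losses for the S and U samples.
    l_s = c * (loss(f_s, 1) - loss(f_s, -1))
    l_u = c * (-pi_m * loss(f_u, 1) + pi_p * loss(f_u, -1))
    return pi_s * np.mean(l_s) + np.mean(l_u)


# Toy check: every positive has score 1.2, every negative has score -0.4,
# and prior_pos = 0.75. The pooled S sample is then 90% positives and the
# U sample is 75% positives, so the estimate should match the supervised risk.
f_s = np.array([1.2] * 9 + [-0.4])
f_u = np.array([1.2] * 3 + [-0.4])
supervised = 0.75 * logistic_loss(1.2, 1) + 0.25 * logistic_loss(-0.4, -1)
print(su_risk(f_s, f_u, 0.75), supervised)
```

In the toy check the two printed values agree up to floating-point error, because the S and U samples are built with exactly their population proportions. With finite random samples the estimate can be negative, since the corrected losses are not bounded below by zero.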
Similar resources
Multiclass Semi-supervised Boosting Using Different Distance Metrics
The goal of this thesis project is to build an effective multiclass classifier which can be trained with a small amount of labeled data and a large pool of unlabeled data by applying semi-supervised learning in a boosting framework. Boosting refers to a general method of producing a very accurate classifier by combining rough and moderately inaccurate classifiers. It has attracted a significant...
Semi-supervised Multi-label Learning by Solving a Sylvester Equation
Multi-label learning refers to the problems where an instance can be assigned to more than one category. In this paper, we present a novel Semi-supervised algorithm for Multi-label learning by solving a Sylvester Equation (SMSE). Two graphs are first constructed on instance level and category level respectively. For instance level, a graph is defined based on both labeled and unlabeled instance...
Dimensionality Reduction of Hyperspectral Data to Increase Class Separability and Preserve Data Structure
Hyperspectral imaging, which gathers hundreds of spectral bands from the surface of the Earth, allows us to separate materials with similar spectra. Hyperspectral images can be used in many applications such as estimation of chemical and physical land parameters, classification, target detection, unmixing, and so on. Among these applications, classification is of particular interest. A hyperspectral im...
On a Theory of Nonparametric Pairwise Similarity for Clustering: Connecting Clustering to Classification
Pairwise clustering methods partition the data space into clusters by the pairwise similarity between data points. The success of pairwise clustering largely depends on the pairwise similarity function defined over the data points, where kernel similarity is broadly used. In this paper, we present a novel pairwise clustering framework by bridging the gap between clustering and multi-class class...
Composite Kernel Optimization in Semi-Supervised Metric
Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...
Journal:
- CoRR
Volume: abs/1802.04381
Publication date: 2018